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Abstract: We use a Stochastic Hybrid Automaton (SHA) model of prostate cancer evolution 
under intermittent androgen suppression (IAS) to study a threshold-based policy for therapy 
design. IAS is currently one of the most widely used treatments for advanced prostate cancer. 
Patients undergoing IAS are submitted to cycles of treatment (in the form of androgen 
deprivation) and off-treatment periods in an alternating manner. One of the main challenges 
in IAS is to optimally design a therapy scheme, i.e., to determine when to discontinue and 
recommence androgen suppression. The level of prostate specific antigen (PSA) in a patient’s 
serum is frequently monitored to determine when the patient will be taken off therapy and 
when therapy will resume. The threshold-based policy we propose is parameterized by lower 
and upper PSA threshold values and is associated with a cost metric that combines clinically 
relevant measures of therapy success. Using Infinitesimal Perturbation Analysis (IPA), we derive 
unbiased gradient estimators of this cost metric with respect to the controllable PSA threshold 
values based on actual data and show how these estimators can be used to adaptively adjust 
controllable parameters so as to improve therapy outcomes based on the cost metric defined. 
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1. INTRODUCTION 

Cancer is widely viewed as a “disease of stages” in which 
tumors must progress through a series of discrete “states” 
in order to ultimately become malignant [Hanahan and 
Weinberg (2011)]. This view is particularly suitable to 
the development of prostate cancer, which is known to 
be a multistep process [Longo et al. (2012)]. For instance, 
a patient diagnosed with localized prostate cancer who 
has had all the tumor surgically removed is considered 
to remain in the state of “localized disease” until he 
progresses to a new state. At each state, distinct therapies 
can be prescribed, and the time spent by the patient in any 
given state is a measure of the efficacy of the corresponding 
intervention. 

The standard treatment for advanced prostate cancer 
patients is hormone therapy in the form of androgen de¬ 
privation [Longo et al. (2012)]. The initial response to an¬ 
drogen deprivation therapy (ADT) is frequently positive, 
leading to a significant decrease in tumor size. However, 
most patients eventually develop resistance to ADT and 
relapse. A generally acceptable mechanism for explaining 
such relapse is the existence of an androgen-independent 
cancer cell phenotype that is resistant to secondary en¬ 
docrine therapy and whose outgrowth leads to tumor re¬ 
currence [Jackson (2004a)]. 

Intermittent androgen suppression (IAS) therapy has 
been recently proposed as a strategy for delaying or 
even preventing time to relapse. The purpose of IAS is 
to prevent the exisiting tumor from progressing into an 
androgen-independent state. In spite of significant clinical 
experience with this approach, defining ideal protocols for 
any given patient remains one of the main challenges as¬ 
sociated with effective IAS therapy [Hirata et al. (2010a)]. 
In fact, recent clinical trials suggest that the success of 
IAS ultimately translates into the ability to tailor on and 
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off-treatment schemes to individual patients [Bruchovsky 
et al. (2006), Bruchovsky et al. (2007)]. The design of op¬ 
timal personalized IAS treatment schemes remains, there¬ 
fore, an unsolved problem. 

Recent attempts at addressing this problem have led 
to the development of several mathematical models that 
explain the evolution of prostate cancer under hormone 
therapy. The model from Jackson (2004a) describes the 
growth of prostate tumors formed by two subpopulations 
of cancer cells, only one of which is sensitive to androgen 
deprivation, and successfully reproduces the experimen¬ 
tally observed three phases of tumor evolution; however, 
the issue of IAS therapy design was not explicitly ad¬ 
dressed. Ideta et al. (2008) applied a hybrid dynamical 
system approach to model prostate tumor evolution under 
IAS and then used it to study the effect of different therapy 
protocols on tumor growth and time to relapse through 
numerical and bifurcation analyses. Several extensions of 
the works by Jackson (2004a) and Ideta et al. (2008) have 
been proposed and we briefly review some of them. A non¬ 
linear model was developed by Shimada and Aihara (2008) 
to account for the competition between different cancer 
cell subpopulations, while Tao et al. (2010) proposed a 
model based on switched ordinary differential equations. 
The problem of individualized prostate cancer treatment 
was formulated as an optimal control problem for which 
a piecewise affine system model was developed by Suzuki 
et al. (2010). Hirata et al. (2010a) modeled the prostate 
tumor under IAS as a feedback control system for the pur¬ 
pose of patient classification, while Hirata et al. (20i0b) 
solved the model from Hirata et al. (2010a) analytically to 
derive conditions for patient relapse. 

Most of the exisiting models provide insights into the 
dynamics of prostate cancer evolution under ADT but do 
not address the issue of therapy design. Moreover, previous 
work focusing on classifying patients into groups in order 
to infer optimal treatment schemes have been based on 
more manageable, albeit less accurate, approaches to non¬ 
linear hybrid dynamical systems. In contrast, a nonlinear 





hybrid automaton model was recently proposed by Liu 
et al. (2015) and d-reachability analysis was used to iden¬ 
tify patient-specific therapy schemes. Although this model 
was shown to be in good agreement with published clinical 
data, it did not account for noise and fluctutations that 
are inherently associated with cell population dynamics 
and monitoring of clinical data. Stochastic effects were 
incorporated into a hybrid model of tumor growth under 
IAS therapy by Tanaka et al. (2010), but the ensuing anal¬ 
ysis was performed considering a pre-determined therapy 
scheme, i.e., no design of personalized therapy was carried 
out. This paper is motivated by the need to develop opti¬ 
mal personalized IAS therapy based on stochastic models 
of prostate cancer evolution and represents a first attempt 
towards this goal using a Stochastic Hybrid Automaton 
(SHA) model of cancer progression. 

In this paper, we draw upon the deterministic hybrid 
automaton model from Liu et al. (2015) to which we 
incorporate stochastic effects. We propose a cost metric 
in terms of the desired outcome of IAS therapy that is 
parameterized by a controllable parameter vector, and for¬ 
mulate the problem of optimal personalized therapy design 
as the search for the parameter values which minimize 
our cost metric. We use Infinitesimal Perturbation Anal¬ 
ysis (IPA) for hybrid systems [Cassandras et al. (2010)] 
to derive gradient estimates of the cost metric with re¬ 
spect to the controllable vector of interest, which can be 
subsequently incorporated into a standard gradient-based 
optimization algorithm. Our main focus is on establishing 
a framework for IPA applications to biologically-inspired 
problems which is illustrated here with the case of prostate 
cancer therapy design. The advantages of our approach 
are twofold: first, we can obtain sensitivity estimates with 
respect to the various model parameters from actual data 
so as to differentiate critical ones from others that are not. 
Moreover, IPA efficiently yields sensitivies with respect to 
controllable parameters in a therapy (i.e., control policy), 
which is arguably the ultimate goal of personalized therapy 
design. 

In Section 2 we formulate the problem of personalized 
therapy design based on an SHA model of prostate cancer 
evolution. Section 3 presents the general framework of IPA 
and details the derivation of IPA estimators for therapy 
evaluation and optimization. We include final remarks in 
Section 4. 

2. PROBLEM FORMULATION 
2.1 SHA Model of Prostate Caneer Progression 

The system we consider comprises a prostate tumor 
under lAS therapy, which is modeled as a Stochastic 
Hybrid Automaton (SHA). We adopt a standard SHA 
definition [Cassandras and Lafortune (2008)]: 

Gh = {Q, X, E, U, f, 4>, Inv, guard, p, go, xq) 
where Q is a set of discrete states; A is a continuous state 
space; A is a finite set of events; U is a set of admissible 
controls; / is a vector field, f : Q x X x U X; cj) is a 
discrete state transition function, (f) : Q x X x E ^ Q; Inv 
is a set defining an invariant condition (when this condition 
is violated at some q € Q, a transition must occur); guard 
is a set defining a guard condition, guard C Q x Q x X 
(when this condition is satisfied at some q G Q, a transition 
is allowed to occur); p is a reset function, p : Q x Q x X x 
A A; <70 is an initial discrete state; xq is an initial 
continuous state. 

With this framework in place, we define a SHA model 
of prostate cancer progression in terms of the following: 

I. Discrete state set Q. Hormone therapy for prostate 
cancer consists of administering medical agents that cause 
androgen suppression in an effort to decrease the popu¬ 
lation of prostate cancer cells and hence the size of the 


tumor. A common biomarker used to monitor the efficacy 
of such treatment is the serum level of Prostate-Specific 
Antigen (PSA), whose value provides an estimate of the 
size of the prostate cancer population. 

In IAS therapy, medication is suspended when a suffi¬ 
cient reduction in the size of the cancer cell populations 
is achieved. Since population sizes are not directly observ¬ 
able, this reduction is estimated in terms of the patient’s 
PSA level; hence, the patient goes off therapy once his PSA 
reaches a lower threshold value. Similarly, medication is re¬ 
instated once the cancer cell populations have significantly 
recovered, which corresponds to when the patient’s PSA 
level reaches an upper threshold value. Thus, we define 
Q = {q^^, where q^^ , respectively) is 

the on-treatment (off-treatment, respectively) operational 
mode of the system. 

2. State space A. The continuous state space A is de¬ 
fined in terms of the biomarkers commonly monitored dur¬ 
ing IAS therapy, namely the PSA level and the androgen 
concentration in the patient’s serum. We assume the co¬ 
existence of two subpopulations of cancer cells within the 
tumor: Hormone Sensitive Cells (HSCs) and Castration 
Resistant Cells (CRCs). The proliferation of the former is 
negatively affected by hormone therapy, while the survival 
rate of the latter decreases in androgen-rich environments. 

We thus define a state vector x{t) = [xi{t),X 2 {t),X 3 {t)] 
with Xi{t) G 1 R+, such that xi{t) is the total population 
of HSCs, X 2 {t) is the total population of CRCs, and 
X 3 (t) is the concentration of androgen in the serum. Since 
prostate cancer cells secrete high levels of PSA, it is 
frequently assumed that the serum PSA concentration can 
be modeled as a linear combination of the cancer cell 
subpopulations, i.e., ciXi{t) + C 2 X 2 (t). Another common 
assumption is that both HSCs and CRCs secrete PSA 
equivalently, so that ci = C 2 = I [Ideta et al. (2008)]. 
In this work, we adopt these assumptions. We also define 
a “clock” state variable Zi{t) G 1R+, i = 1,2, where 
zi{t) {z 2 {t), respectively) measures the time spent by the 
system in state qoN (qoFF, respectively). The clock is 
reset to zero once a state transition takes place. Setting 
z{f) = [ 2 ;i(t), Z 2 (I)], the complete state vector is [x{t),z{t)]. 

3. Event set E. We define the SHA event set as 

E = { 61 , 62 }, where 61 corresponds to the condi¬ 

tion [a;i(t) -|- X 2 it) = 9i from above] and 62 corresponds to 
[xi (t) -I- X 2 (t) = O 2 from below]. 

4. Admissible control set U. As described earlier, IAS 

therapy consists of cycles of androgen deprivation deliv¬ 
ered intermittently with off-treatment periods. The cycles 
of androgen deprivation are suspended when the patient’s 
PSA level reaches a lower threshold value, while ther¬ 
apy recommences once the PSA level reaches an upper 
threshold value. Hence, an IAS therapy can be viewed as 
a controlled process characterized by two parameters: 0 = 
[9i, 92 ] G 0, where 9i G 0™“] is the lower threshold 

value of the patient’s PSA level, and 02 G [0™'",0™“] is 
the upper threshold value of the patient’s PSA level, with 
0 ™ax ^ 9 ™™. At any time t, the feasible control set for 
the IAS therapy controller is U = {0,1} and the control is 
defined as: 


u{x{t),z{t)) 


0 if Xi{t) + X2(t) < 02, q{t) = q^^^ 

1 if xi {t) + X 2 {t) > 01 , q{t) = q^^ 


( 1 ) 

This is a simple form of hysteresis control to ensure that 
hormone therapy will be suspended whenever a patient’s 
PSA level drops below a minimum threshold value, and 
that therapy will resume whenever a patient’s PSA level 
reaches a maximum threshold value. 



5. System dynamics. The SHA system dynamics describe 
the evolution of continuous state variables over time, as 
well as the rules for discrete state transitions. First, the 
continuous (time-driven) dynamics capture the prostate 
cancer cell population dynamics, which are defined in 
terms of their proliferation, apoptosis, and conversion 
rates. Existing studies commonly use Michaelis-Menten- 
like functions to model the rates of proliferation and 
apoptosis [Ideta et al. (2008), Jackson (2004a), Jackson 
(2004b)]. Recently Liu et al. (2015) obtained greater con¬ 
sistency between clinical data and simulated population 
dynamics by modeling these rates using sigmoid functions. 
In what follows, we incorporate stochastic effects into the 
deterministic model from Liu et al. (2015) and obtain: 


xi{t) = ai 

-Pi 


2^ _|_ g-(a:3(t)-fci)fc2 


1 -b e 


-{X3{t)-k3)ki 


-1 


+Mi + Ci(^) 
X2(t) = 


TOi ( 1-) -b Al 

2 : 3.0 


• Xi(t) 
• Xi(t) 

■ Xi(t) 

, , X2(t) 

2 : 3.0 / J 

-bmi ( 1 - xi (t) -b C2 (t) 

^ 3.0 J 


a 2 (l-d^) -/32 


( 2 ) 


(3) 


2:3 {t) 


xajt) 

a 


+ M3 + Csit) 


2:3.0 - Xsit) 
a 


+ M3 + Csit) 


if xi{t) -b X 2 {t) > 9i 
and q{t) = 
if xi lt) -b 2 : 2(0 < ^2 
and qit) = q^^^ 

(4) 


zi{t) 


1 if qit) = q^^ 
0 otherwise 


ziit+) = 0 


if a:i (0 -b 2:2(0 = 
and qit) = q^^ 


Oi 


(5) 


i2(0 


1 iiqit)=q^^^ 
0 otherwise 


22 ( 1 +) = 0 


if a:i (0 + 2:2 (t) = 6>2 
and qit) = q^^^ 


( 6 ) 


where ai and 0:2 are the HSC proliferation constant and 
CRC proliferation constant, respectively; Pi and P 2 are 
the HSC apoptosis constant and CRC apoptosis con¬ 
stant, respectively; ki through ki are HSC proliferation 
and apoptosis exponential constants; mi is the HSC to 
CRC conversion constant; X 3 ^o corresponds to the patient- 
specihc androgen constant; u is the androgen degradation 
constant; Ai is the HSC basal degradation rate; ni and 
/i 3 are the HSC basal production rate and androgen basal 
production rate, respectively. Finally, {Ci(i)}, * = 1,2,3, 
are stochastic processes which we allow to have arbitrary 
characteristics and only assume them to be piecewise con¬ 
tinuous w.p. 1 . 

Observe that (2) and (3) seem to be independent of the 
discrete state (mode) qit); however, their dependence on 
x^it), whose dynamics are affected by mode transitions, 
implies that xi(t), 2:2 (t) also change due to such transi¬ 
tions. To make this behavior explicit, we can solve (4) 
for 2:3 (t) and substitute this solution into (2) and (3), as 
detailed next. 

Consider a sample path of the system over [0,T] and 
denote the time of occurrence of the fcth event (of any 
type) by Tfc(0). Since our complete system state vector is 


[xit), zit)], we shall denote the state dynamics over any 
interevent interval [Tfc(0),rfc+i(0)) as follows: 

Xnit) = fn,kit), Ziit) = n = 1,..., 3, i = 1, 2 

Although we include 9 as an argument in the expressions 
above to stress the dependence on the controllable param¬ 
eter, we will subsequently drop this for ease of notation as 
long as no confusion arises. 

We start our analysis by assuming qit) = q^^ for t G 
[Tfc,Tfc+i). It is clear from (4) that 2 : 3 (t) = — _l_ ^2 _(_ 

Ciit), which implies that, for t € [Tfc,Tfc+i), 

xsit) = 2:3(T^)e“(‘“'^'“^/'^ 

-be • f [/i 3 -b C3(e)] de 
J Tk 

For notational simplicity, let 

Ut)= f e-^^-^^/-Ue)de (7) 


and define, for t € [Tfc,rfc+i), 

h^^{t,Ut)) =2:3(r+)e-(‘--'=)/'^ 

-bM30-[l - -b (sit) 

Now let qit) = q^^^ for t G [T/c,r/c+i), so that (4) implies 
that, for t G [Tk,Tk+i), 


xsit) =X3iT))))e 

-b(M 3 CT -b 2:3,0)]! - -b (sit) 

Similarly as above, we define, for t G [t^jT^+i), 

-b(M30- + 2:3,0)]! - -b (sit) 

(9) 

It is thus clear that 


(tXsit)^ iiqit)=q^^ 
h^^^{t,Ut)) iiqit)=q^^^ 


Although we include (sit) as an argument in ( 8 )-(9) to 
stress the dependence on the stochastic process, we will 
subsequently drop this for ease of notation as long as no 
confusion arises. It is now clear that, by using (8)-(9) in 
(2)-(3), we may rewrite the state variable dynamics as 


xiit) 


and 


|ai [1-b <()Q^(t)] - Pi [lit)] 

fh^^it)\ , ,0 

+mi ( —--j - (mi -b Al) -xiit) 

-\-Hi(lit) iiqit) = qO^ 

(ai [1 -b P^^^it)] -Pi[l + P^^^it)] 

fhOPFit)\ , ^ I 

+mi ( —--j - (mi -b Al) ^ • Xiit) 

.+Mi+Ci(i) if 9(0 = 9 *^^^ 

( 10 ) 



X2{t) = < 


V 3^3,0 / 

+mi ( 1 - - - — ) Xi{t) + C2{t) 


X3,0 


a2 ( 1 - ) - (32 


+mi 1 — 


^3^0 

•ofP 


it) 


if q{t) = 

X2it) 


X3,0 


Xlit) + C2(f) 


if q{t) = q 


OFF 


with 




( 11 ) 


The discrete (event-driven) dynamics are dictated by 
the occurrence of events that cause state transitions. Based 
on the event set E = {61,62} we have defined, the 
occurrence of 61 results in a transition from q^^ to q^^^ 
and the occurrence of 62 results in a transition from q^^^ 
to q^’^. 


2.2 IAS Therapy Evaluation and Optimization 

The effect of an IAS therapy u (9, t) depends on the 
controllable parameter vector 9, as in (1), and can be 
quantified in terms of performance metrics of the form 
J[u(0,t)]. Evaluating J[u(0,t)] over all possible values 
of 9 is clearly infeasible. However, there exist very effi¬ 
cient ways to accomplish this goal for stochastic hybrid 
systems. In particular. Perturbation Analysis (PA) is a 
methodology to efficiently estimate the sensitivity of the 
performance with respect to 9, i.e., when 0 is a real-valued 
scalar, the derivative dJ/d9. This is accomplished by ex¬ 
tracting data from a sample path (simulated or actual) of 
the observed system based on which an unbiased estimate 
of dJ/d9 can indeed be obtained. The attractive feature 
of PA is that the resulting estimates are extracted from 
a single sample path in a non-intrusive manner and the 
computational cost of doing so is, in most cases of interest, 
minimal [Cassandras and Lafortune (2008)]. This is in 
contrast to the the conventional finite difference estimate 
of dJ/d9 obtained through [J{9 -I- A) — J(0)]/A. Thus, for 
a vector 9 of dimension N, estimating the gradient VJ (9) 
requires a single sample path (with some overhead) instead 
of A -I- I sample paths. The simplest family of PA estima¬ 
tors is Infinitesimal Perturbation Analysis (IPA). For the 
SHA model we consider here, IPA has been recently shown 
to provide unbiased gradient estimates [Cassandras et al. 
(2010)] for virtually arbitrary stochastic hybrid systems. 
Our goal, therefore, is to adapt IPA estimators of the 
form dJ[u{9,t)]/d9 for different therapies u{9,t), thus es¬ 
timating their effects, and to ultimately show that optimal 
therapy schemes can be designed by solving problems of 
the form min^ge •/[u(0,t)]. 

We define a sample function in terms of complementary 
measures of therapy success. In particular, we take the 
most adequate treatment schemes to be those that (i) 
ensure PSA levels are kept as low as possible; (ii) reduce 
the frequency of on and off-treatment cycles. There is an 
obvious trade-off between (i) and the cost associated with 
(ii), which is related to the duration of the therapy and 
could potentially include fixed set up costs incurred when 
therapy is reinstated. For simplicity, we disconsider fixed 


set up costs and take (ii) to be linearly proportional to 
the length of the on-treatment cycles. We thus associate 
weights Wi, i = 1,2, with (i) and (ii), respectively, and 
define our sample function as the sum of the average PSA 
level and the average duration of an on-treatment cycle 
over a fixed time interval [0,T]. We also normalize our 
sample function to ensure that the trade-off between (i) 
and (ii) is captured appropriately: we divide (i) by the 
value of the patient’s PSA level at the start of the first on- 
treatment cycle (PSAinit), and note that (ii) is naturally 
normalized by T. Recall that the total population size of 
prostate cancer cells is assumed to reflect the serum PSA 
concentration, and that we have defined clock variables 
which measure the time elapsed in each of the treatment 
modes, so that our sample function can be written as 


L{9,x{t)),z{t)),T)=- 


Wi 

' rT 
0 


xi(9,t) + X2(9,t) 


PSA, 


dt 


+ W 2 zi{t)dt 


( 12 ) 

where a;(0) and z(0) are given initial conditions. We can 
then define the overall performance metric as 


J (9, x{0), 0(0), T) = E[L [9, x(0), 0(0), T)] (13) 

Hence, the problem of determining the optimal IAS 
therapy can be formulated as 

mini:;[L(0,x(O),0(O),r)] (14) 


3. INFINITESIMAL PERTURBATION ANALYSIS 


Consider a sample path generated by the SHA over [0, T] 
and recall that we have defined Tki9) to be the time of 
occurrence of the fcth event (of any type), where 0 is a 
controllable parameter vector of interest. We denote the 
state and event time derivatives with respect to parameter 
9 as x'{9,t) = and t({9) = ^^2^, respectively, for 

fc = 1,..., A. As mentioned earlier, the dynamics of x{9, t) 
are fixed over any interevent interval [Tk{9),Tk+ii9)] and 
we write x{t) = fk (9, x, t) to represent the state dynamics 
over this interval. Although we include 9 as an argument 
in the expressions above to stress the dependence on the 
controllable parameter, we will subsequently drop this 
for ease of notation as long as no confusion arises. It is 
shown in Cassandras et al. (2010) that the state derivative 
satisfies 


with the following boundary condition: 


(15) 


^'i^k ) = ^'i'^k ) + [fk-iixu ) - fkir ^)] • x'k (16) 


We note that (16) is valid when x{9,t) is continuous in t 
at t = Tk- If this is not the case and the value of x{t^ ) is 
determined by the reset function p (q, q', x, e), then 


= ,17) 

du 

Knowledge of r} is, therefore, needed in order to evaluate 
(16). Following the framework in Cassandras et al. (2010), 
there are three types of events for a general stochas¬ 
tic hybrid system: (i) Exogenous Events. These events 
cause a discrete state transition independent of 9 and 
satisfy = 0. (ii) Endogenous Events. Such an event 
occurs at time Tk if there exists a continuously differ¬ 
entiable function : M" x 0 R such that = 
minjt > Tfc-i : gk (x{9,t),9) = 0}, where the function gk 
usually corresponds to a guard condition in a hybrid 



automaton. Taking derivatives with respect to 9, it is 
straighforward to obtain 

n -1 


dgk ^ , 

7 J K- 

dx 


L(T'fc ) 


de^ dx ^ ’ 


(18) 


as long as ■/fc-i(T’fc ) 7^ 0. (in) Induced Events. Such an 
event occurs at time if it is triggered by the occurrence 
of another event at time Tm < Tk (details can be found in 
Cassandras et al. (2010)). 

Returning to our problem of personalized prostate can¬ 
cer therapy design, we define the derivatives of the states 
Xn(9,t) and Zj{0,t) and event times Tk{0) with respect to 
Oi, i, j = 1, 2, n = 1,..., 3, as follows: 

^ dxn{9,t) ^ dzjie,t) 


de,. 


, _ dTk{9) 

^kA — 


dO) 


1,' it) = . 

dt ’ dxi ‘ 


-b 


dzi 


From (10), we have 
n = 2,3, i = 1, 2, and 


■'At) 

z[{t) -\ 


a/r 


dx2 
a ft'it) 






dz2 

a/r (0 (*) 


dXn 


dzi 


dfkAt), 


dxz 
dfkAt) 
d9, 

a/r (0 

d6i 


■’At) 


= 0 , 


— oii\A + {t)\ — I3i\i + {t)\ 


-1 


dxi 


—mi ( 1 — 


jt) 

X3,0 


— Al 


( 20 ) 

Combining the last two equations and solving for x( ^{t), 
we obtain 

x'i,iA) = t e [Tfe,Tfe+i) 

and, in particular, 


c'i,i(rfc+i) =x'i,i(r+)e^(^") 


( 21 ) 

( 22 ) 


with 


A (Tk) = [ 
J Tk 


fTk + l 


pTfc + i 

ai 

Pi 

Ark 

\E4>ON{t) 

1 + P2^(t)_ 


dt 


- I lALhON i^t) dt - (mi -b Al) (Tfc+i - Tk) 

J Tj^ ^3,0 

d (t) 

Similarly for X 2 {t), we have from (11) that 
= 0, i = 1,2, and 


dfkAt) hO^jt) 

dxi V 2:3,0 y /nql 

0x2 \ Xsfi J 

Combining the last two equations and solving for x '2 j (t) 
yields, for t e [rfc,rfc+i), 

A2,^(t) =a;2,i(2‘fc^)e®'^‘^ +B 2 {t,x[^,{TA),A(t)) (24) 

and, in particular, 

4.*(T-fc”+i) = -b B 2 {Tk,xA(TA),A(Tk)) 

(25) 

with 


dt 


(19) 

Our goal is to obtain an estimate of VJ(9) by evaluating 
the sample gradient VL(9), which is computed using data 
extracted from a sample path of the system (e.g., by 
simulating a sample path of our SHA model using clinical 
data). We will assume that the derivatives dL/dOi exist 
w.p. 1 for all 6i G K.+ . It is also easy to verify that L (9) is 
Lipschitz continuous for 9i G K"*". We will further assume 
that no two events can occur at the same time w.p.l. Under 
these conditions, it has been shown in Cassandras et al. 
(2010) that dL/d9i is an unbiased estimator of dJ/d9i, 
t = 1,2. 

In what follows, we derive the IPA state and event time 
derivatives for the events indentified in our SHA model of 
prostate cancer progression. 

3.1 State and Event Time Derivatives 

We begin by analyzing the state evolution considering 
each of the states (q'^^ and and events (ei and 62) 

dehned for our SHA model. 

1. q(t) = q^^ for t G [Tk,Tk+i). Using (15) for xi(t), we 
have, for i = 1, 2, 


r'^k+1 


!Tk 

L \ 2 : 3,0 J \ 


Bl (Tk) = ( 
d Tk 

H2(-) = e^i Gi{t,Tk)e-^^ 

ATh 




where Gi(t,Tfc) = mi (l - x[ i(T+)e^^*\ 


t G 


[Tk,Tk+l). 

In the ( 

df(Ht) _ dfl^it) 


Q i*3 

In the case of 0:3 (t), it is clear from (4) that — 


= 0, i = 1,2, and 


of(Ht) 


dzi ~ d9i — V, ^ g^^ — 

Substituting in (15) and solving for x'^^(t), for i = 1,2, 
yields x'^ ift) = 4 ^(Tjf)e-(‘-'"'=)/'^, t G [Tk,Tk+i), and, in 
particular, 

4..(2-,+i) = 4.i(2-,+ )e-(‘--)/'^ (26) 

Finally, in the case of the ’’clock” state variable Zi{9,t), 
i = 1, 2, based on (5) and (6), we have ~ ~ 

= 0, n = 1,...,3, i = 1,2, so that ^z'(t) = 0 

for t G [Tfc,Tfc+i). This means that the value of the state 
derivative of the ’’clock” variable remains unaltered while 
q{t) = i.e., zAt) = z'iirA), t G [Tk,Tk+i). 

2. q(t) = q^^^ for t G [Tfc,Tfe+i). Starting with a:i(t), a 
similar reasoning as above applies, but now we have 

<*)=„,[! + - A [1 + 


i9a;i 


2^3,0 


As a result, (15) implies that, for i = 1,2, 

X'l,i(t) = x'iATA)e^^*\ t G [Tk,Tk+l) 
and, in particular, 

2;M(Tfc+i) = 2;'i,,(T+)e'^(”'') 

with 


(27) 

(28) 


C{Tk)= f 

A rt 


r^k+i 


r'fk+i 

OC\ 

Pi 

A Tk 

i + po^^A) 

i + P^/^(t)_ 


dt 


J Tk ^3,0 

Similarly for X 2 (t), we have 


— (t) dt - (mi -b Al) (Tfc+i - Tk) 


dxi ^ V a;3,o 


dzi 


d9i 


dx2 \ X3fi 

It is thus straightforward to verify that (15) yields, for 
t G (rk 5 'Tfc+i), 



X2,iit) = + D 2 (t)) (29) 

and, in particular, 

4,i(T-fc"+i) = +D 2 (rfc,x'i_i(T^),C'(rfe)) 

(30) 


with 


Oi2 

Tk + l 


1-d- 


Di in) = [ ' 
D2{-)=e^^^^^n G2{t,Tk) 

J Tk 


(t) 


-P 2 

X3,0 

.-Dlirk)dt 


dt 


Jk _ 

dxi 


k _ 

dzi — 


where G 2 {t,Tk) = rrii (l - t 

['^k j ) ■ 

In the case of x^it), we will have 

Ofpit) ^ dftHt) 

gg. — 0, I — 1 , 2 , and 

implies x'^g^{t) = ^3 j(T+)e“(‘“'^'=)/'^, t e [Tfc,Tfc+i), and, 
in particular, 

(31) 


= —^, so that (15) 


Finally, in the case of the ’’clock” state variable Zi{9,t) 
1 = 1,2, based on (5) and ( 6 ), we have 
dflHt) 


dOi 


= 0 , n = 1 , 


,3, i = 1,2, so that 


dzi ~ 

Z'{t) = 0 


for t G [Tfc,Tfc+i). This means that the value of the state 
derivative of the ” clock” variable remains unaltered while 
q{t) = i.e., z'(t) = z{{t^), t G [Tfc,Tfc+i). 

3. A state transition from q^^ to q^^^ takes place at 
time Tfc. This necessarily implies that event ei occurred at 
time Tk. From (16) we have, for i = 1, 2, 

^lA'^k ) = ^lA'^k ) + [/r i'^k ) - fkU^^k)] • (32) 

Observe that, from (10), /^^(tjT) and fkfi{Tk) ulti¬ 
mately depend on (''’aT) (r^), respectively. 

Also, a transition from q^^ to q^^^ at time Tk implies 
that q{t) = q^^, t G [Tk-i,Tk) and q{t) = , t G 

[rfc,Tfc+i). Hence, evaluating (t^) from ( 8 ) over the 
appropriate time interval results in 

(rk) =a:3(r+_i)e-(--— 


+pi 3 a[l - e 


_ p-Gk-Tk-l) / O 


Csirk) 


and it follows directly from (9) that (r^) = 0:3 (r^). 

Furthermore, by continuity of Xn (t) (due to conservation 
of mass), Xn{T ^) = Xn{Tff), u = 1,2, 3. Also, since we have 
assumed that {0(^)1, * = 1,2,3, is piecewise continuous 
w.p.l and that no two events can occur at the same time 
w.p.l, Ci{T^) = Qirjf), i = 1,2,3. Hence, by evaluating 
(T-fc) = /r(T'fe”) - fkXii.Ti) we obtain 

A} {Tk, Cs {Tk)) = {«! [1 + 4>a^{Tk )] 


-ai [l + <^o^^(r+)]- [l + <^r(r,-)]- 

+ /33 [l + <^r^(r+)]-^ 


(33) 


mi 

3^3,0 


{^k ) -a;3(Tfc)] \ •xi(Tfc) 


Finally, the term r(.j, which corresponds to the event 
time derivative with respect to 9i at event time Tk, is 
determined using (18), as will be detailed in Lemma 1 
later. 

A similar analysis applies to X 2 {t), so that, for z = 1,2, 
4,* {Tk ) = 4.* {Tk ) + [/r (Tfc- )-fkli (r+)] • (34) 


where r(, ^ will be derived in Lemma 1, and fk^{Tjf) and 
fk+i{Tk) ultimately depend on (t^) and (Tk)^ 

respectively. Hence, evaluating (rfc) = fk^{Tk) — 

fklii^k) from ( 11 ) yields 

A/ {Tk, Cs {Tk)) = — [a;3(Tfc) - {t^ )] • X2{Tk) 

*^3,0 

(tjT) - a: 3 (Tfe)] • xi(Tfc) 

3^3,0 

(35) 

Finally, for X 3 {t), (16) can be easily seen to yield, for 

* = 1 , 2 , 


XsA^k) =A,i{Tk ) - 


X3,0 


Tk,i 


In the case of the ’’clock” state variable, Zi{t) is discon¬ 
tinuous in t at t = T/c, while Z 2 {t) is continuous. Applying 
(17) to the reset function defined in (5) yields z{ ^{t ^) = 0, 


z = 1 , 2 , i.e., the value of z[At) is reset to zero whenever 

event ei takes place. Based on (16) and ( 6 ), it is simple to 
verify that, for z = 1 , 2 , 


AiAk) = 4,i{^k ) -Tk,i 


4. A state transition from q‘^^^ to q^^ takes place at 
time Tk- This necessarily implies that event 62 occurred 
at time Tk. The same reasoning as above applies and from 
(16) we have, for z = 1 , 2 , 


x'l,i{Tk) = x'i/t^ ) -h [fl{T^ ) - fl+i{T+)] • tA (36) 


where fl{Tff) — fAi{T^) can be evaluated from ( 10 ) and 
ultimately depends on Ak) Ak)- Recall 

that a transition from q‘^^^ to q^^ at time Tk implies 
that q{t) = t G [Tk-i,Tk) and q{t) = q^^, t G 

[Tk,Tk+i). Hence, evaluating (r^) from (9) over the 

appropriate time interval results in 

h^^^Ak) =X3(r+i)e-fr-^'=-)/- 

+{fX3a + X3^o)[l - e + C3(Tfc) 


and it follows directly from (8) that Ak) = X3{Tk)- 
As in the previous case, continuity due to conservation 
of mass applies, so that evaluating A^Tk) = fAxif) — 
fl+i{T)A yields 

A}(Tfc,C3 (Tfc)) = |ai [1 -b (/>°^^(Tfc")] ^ 

- «i [1 + 0r(r+)] - /3i [1 + </>r"(-.-)]" ,,,, 

+ A[l + 0 g»(r+)]-‘ (3*) 

+— {rp} -3;3(Tt)]| .a,i(Tfc) 

3 : 3,0 J 


The term t (, ^ corresponds to the event time derivative 
with respect to 9i at event time Tk and its derivation will 
be detailed in Lemma 1. 

Similarly for X 2 {t), we have, for z = 1, 2, 


X2,iAk) = X2,iAk ) + [fk{Tk ) - fAlAk )] • A,i (38) 
where t (, ^ will be derived in Lemma 1. Evaluating 
A^(rfc) = /fc(Tfc") - /fe+i(r^) from (11), and making the 
appropriate simplifications due to continuity, we obtain 

A/(Tfe, C 3 {Tk)) = — [3;3(Tfe) - (tjT)] • X2{Tk) 

3 : 3,0 

Ak ) - X3{Tk)] ■ Xi{Tk) 

X3,0 

( 39 ) 



Finally, for X 3 {t), (16) can be easily seen to yield 




' k,i 


,* = 1 , 2 . 


In the case of the ’’clock” state variable, zi{t) is contin¬ 
uous in t at t = Tfc, while 2:2 (i) is discontinuous. Based on 
(16) and (5), it is simple to verify that, for i = 1, 2, 


z'lA^k) = ^lATk )-Tk,i 


Applying (17) to the reset function defined in ( 6 ) yields 
A,iAk) = 0 , * = 1 , 2 , i.e., the value of AAA is reset to 
zero whenever event 62 takes place. 

Note that, since = AA'^k)^ ^ ^ [Tk,Tk+i), we 

will have that ^'^(t^) = 2 :yi(r^_i), j,i = 1,2. Moreover, 
the sample path of our SHA consists of a sequence of 
alternating ei and 62 events, which implies that iA^) = 
0 if event ei occurred at r^-i, while A iA^) = 0 if event 
62 occurred at Tk-i- As a result. 


AA'' 


and 




—r'k j if event 62 occurs at 
0 otherwise 

—r'k j if event ei occurs at 
0 otherwise 


(40) 

(41) 


We now proceed with a general result which applies to 
all events defined for our SHA model. Let us denote the 
time of occurrence of the jth state transition by Tj, and 
define its derivative with respect to the control parameters 
as rj j = 1^, i = 1 , 2 . We also define /J" Aj) = ^A'^j), 
n = 1,..., 3 and note that at each state transition at time 
Tj an event Cp, p = 1 , 2 , will take place. 

Lemma 1. When an event Cp, p = 1, 2, occurs, the deriva¬ 
tive rj j, i = 1 , 2 , of state transition times Tj, j = 1 , 2 ,... 
with respect to the control parameters 9i, i = 1 , 2 , satisfies: 


1 [p = *] - AA-i) - AAi) 


3A 


(42) 


f!AA-) + f!AA-) 

where 1 [p = z] is the usual indicator function. 

Proof.We begin with an occurrence of event ei which 
causes a transition from state to state at time 

Tj. This implies that gjAA) = Xi + X 2 — 0i = 0. As a 

result, = 1 , = 0 , z = 1 , 2 , and 

= — 1, and it is simple to verify that (42) follows from 

(18)- 

Next, consider event 62 at time Tj, leading to a transition 
from state q^^^ to state q^^. In this case, gjA^ 0) = xi + 

. 7-0 _ Po — n on ftinf _ dgk _ i dgk _ dgk _ dgk _ n 

X 2 02 - U, so tnat - l, g^^ - g^_ - gg^ - U, 

z = 1,2, and ^ = —1. Substituting into (18) once again 
yields (42). ■ 

We note that the numerator in (42) is determined using 
(22) and (25) if qAf) = 01 (28) and (30) if qAf) ~ 

qOFF ^ Moreover, the denominator in (42) is computed 
using (lO)-(ll) and it is simple to verify that, if event 
ei takes place at time Tj, 

fAiAA) + fj-AA) = ai [^ + (t>2^Ai)] ^-xiAA 

,-i 


|/3i [l + (l)^^Aj )] 


0,2 


(r, 

1-d -^ 

a^3.o 



+ Cl ( 77 ') + AAj) 
and, if event 62 takes place at time Tj, 


fj-iAj ) + fjAAj ) = ai [1 + (Pa^^Aj )] ^ • xiAj) 

- {/3i [1+ C'°'^^('r“)] ^ +^ 1 } ■a;i(Tj) + pi 



( hOFPAiY 

\ 
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a2 I 1 — d - 

-/32 


V ^3,0 
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+ Ci(77') + C2Aj) 


This completes the derivation of all state and event time 
derivatives required to apply IPA to the hybrid automaton 
model of prostate cancer. In what follows, we shall derive 
the cost derivatives corresponding to the performance 
metric defined in ( 12 ). 


3.2 Cost Derivative 


Let us denote the total number of on and off-treatment 
periods (complete or incomplete) in [ 0 ,T] by Kt- Also let 
A denote the start of the period and pfc denote the end 
of the period (of either type). Finally, let Mt = 
be the total number of complete on-treatment periods, 
and 2APA denote the duration of the complete on- 
treatment period, where clearly 


Theorem 1. The derivative of the sample function L{9) 
with respect to the control parameters satisfies: 


dL{9) _ 

~ T 2-^ 


r'Hk 


dOi 
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(43) 


Kt Kt 
^ 2 ^ 2 


Cmt-I-I,* ’ A' ^Mt + A 


Proof.We assume, without loss of generality, that the 
start of our sample path will coincide with the start of 
the Hrst on-treatment period. Note also that we choose 
to end our sample path at time T, and that this choice 
is independent of 9i, i = 1,2. Consequently, we will have 

[ 0 ,T] = [AiVkAi which implies that = 0 , 

z = 1,2. Recall, from the definition of an intermittent 
hormone therapy scheme, that the sample path of our SHA 
will consist of alternating on and off-treatment periods. 
Since zi{t) = 0 when q{t) = q^^^, we can rewrite (12) as 


Li9,x{0),zi0),T) = ^y ( 
k=l -*7 


rVk 

'xi{9,t) - 1 - X2{9,t) 

Lk 

PSAinit 


dt 


/ z\{t)dt+ / zi{t)dt 

m=l''ik •> (.Mt + i 

(44) 

Note that our sample path can either (a) end with 
an incomplete on-treatment period, or (b) end with an 
incomplete off-treatment period. In (44), we assume that 
(a) holds, since (b) is a special case of (a) for which 

rji 

A zi(t)dt = ^ A"' Zi{t)dt. Observe that the end of 

m— 1 ^ 

an on-treatment period is coupled with the start of the 
subsequent off-treatment period, i.e., x, {ijk) = Xi (^fc+i), 
z = 1, 2, A: = 1,..., Kt — 1. Using this notation and taking 
the derivative of (44) yields 
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+ rj. 
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1 ■ ri>A^nU 
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^ [a;i(^_ffr)+ 2 ; 2 (CifT)] 
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Observe that the first two summation terms in (45) 
simplify to 

Kt — 1 


Y [ 2^1 (5fc+l) +2^2 (^fc+l)] 


fc=l 


9^fc+i 
90. 

I^T — l 

-Y [^i{ik)+x2m^. 

k=l * 

= [xi(a)+x2(a)]|| 

+ [a^i {^Kt ) + X2 {S,Kt )] 


(46) 


90. 

Further note that the sixth term in (45) cancels out with 
the second term on the right hand side of (46). Moreover, 
it is clear from (5) that = Z\{^'p) = 0 and 

= Vm-^m, TTi = 1,..., Mr■ Since z'^ pt) = Zj^^rp), 
j,i = 1 , 2 , over any interevent interval [rk^Tk+i), and 
recalling that ^ = 0, the last two terms in (45) 

simplify to 

Mt p 


VF 2 

rp / y 


^l,i(Cm) iVm Cm) + iVm Cm) 


dr]ry, 

90. 


VF 9 

+ —4..(C)|f, + l)(T’-CM, + l) 

Recall that Cm is the start of the mth on-treatment period, 
which necessarily corresponds to the m — 1 th occurrence 
of event 62 - Hence, it follows from (40) that 2 :) i(Cm) = 

m = 1,..., Mt+ 1 - As a result, (45) can be further 
simplified to 

fik+i 


dL{0) 

dOi 
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fc=i 
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[x'i,i{S,t)+x'2^i{0,t)\ dt 
[x'l,^{dA) Px'2^^{0,t)\ dt 
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T ■ PSAir^it 
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,m—l 


IF 2 
' T 


Wo . 

- TjT^Mt + I ('^ “ Cmt-Hi) 


(47) 


The result in (47) is obtained under the assumption that 
our sample path ends with an incomplete on-treatment 
period. This condition is satisfied when If 

this is not the case, i.e., if the sample path ends with an 
incomplete off-treatment period and the last 

term in (47) can be disregarded. It is then straightforward 
to verify that (47) can be rewritten as (43). ■ 

Observe that evaluating (43) requires knowledge of 
x[p0,t) and X 2 i{ 0 ,t) over all on and off-treatment pe¬ 
riods. Over on-treatment periods, this can be determined 
using (21) and (24), which ultimately depend on ( 8 ), so 
that it is necessary to evaluate the integral of the noise 
process C 3 (I)- In the case of off-treatment periods, (27) 
and (29) must be used, so that (9) must be evaluated, for 
which knowledge of the integral of the noise process 
is also needed. In the second and third terms in (43), 
can be computed using timers whose start and end times 
are observable events, while rj'^ ^ and Cm = I) • ■ • i AIt, 

and eventually CmtS-ii’ computed through (42), 

which requires knowledge of the noise processes Ci {t) and 
C^it) evaluated at event times only. 

4. CONCLUSION 

Biological systems are inherently sensitive to physiologic 
cues, such as the timing and dosage of drugs and related 
procedures. Hence, performing sensitivity analysis of the 
mathematical models that infer patient response to such 
cues will aid the development of personalized treatment 
schemes. Such sensitivity analysis should not only allow for 
evaluating the sensitivies with respect to model parame¬ 
ters, but also, and most importantly, provide sensitivity 
estimates with respect to controllable parameters in a 
therapy. The methodology we have laid out in this paper 
addresses both these needs. 

Indeed, this work is the first step towards the devel¬ 
opment of a methodology for personalized therapy design 
applicable to stochastic models of cancer progression. We 
illustrate our analysis with a case study of optimal IAS 
therapy design for prostate cancer. For such, we propose 
an SHA model to describe the evolution of prostate cancer 
under IAS therapy and derive a cost metric in terms of 
the desired outcome of IAS therapy that is parameterized 
by an appropriately chosen controllable parameter vector. 
The problem of optimal personalized therapy design is 
then formulated as the search for the parameter values 
which minimize our cost metric. In this context, we apply 
Infinitesimal Perturbation Analysis (IPA) and derive un¬ 
biased gradient estimates of the cost metric with respect 
to the controllable vector of interest, which can be used 
for sensitivity analysis of therapy schemes. 

More importantly, however, since (43) provides an unbi¬ 
ased estimate of dJ ( 0 ) / d0i , it can also be nltimately used 
for therapy estimation and optimization. To this end, it 
is possible to implement an algorithm for updating the 
value of dL{0)/d0i after each observed event. Such value 
can then be used to compute an optimal 0 * through an 
interative optimization procedure of the form Oi^i+i = 
di,i — PiHi^i {0i,x{O),T,uJi), where pi is the step size at 
the Ith iteration, I = 0 , 1 ,..., and w; denotes a sample 
path from which data are extracted and used to compute 
Hi^i {0i,x{O),T, uji) defined to be an estimate of dJ{0)/d0i. 
The sample paths can be generated through simulation of 
existing models (e.g., our SHA model), so that, by varying 
the model parameters, different patient behaviors can be 
analyzed. Alternatively, it is possible to apply our IPA 
estimators to real patient data obtained from clinical trials 
(available e.g., in Bruchovsky et al. (2006) and Bruchovsky 
et al. (2007)), and nltimately contrast the optimal therapy 



scheme 0* with the prescribed one. Our ongoing work 

involves implementing the IPA estimators derived in this 

work for personalized IAS therapy design. 
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